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I. INTRODUCTION 


A. NOISE HINDERS SIGNAL CLASSIFICATION 

Ambient ocean noise can significantly reduce the effectiveness of underwater signal 
classification, both for the human listener and for automatic classifiers such as artificial 
neural networks. While performance can be improved with adequate training, many 
transients are rare and occur only in specific locations. This makes it difficult and 
expensive (if not impossible) to collect a sufficiently large number of examples to perform 
sufficient training. 

Artificial Neural Networks are particularly well suited for pattern classification prob¬ 
lems, and their application to underwater transient signal classification continues to be 
an active area of research. When sufficient training signals are not available however neu¬ 
ral networks can classify according to false clues. In particular, it is known that in low 
signal to noise conditions a neural network may classify signals based on characteristics 
of the noise rather than characteristics of the signal of interest. [1] 

Human listeners are similarly plagued by a number of factors which can impair their 
ability to classify signals in the presence of noise. One of the major problems is that to 
some extent human listeners hear what they expect to hear. This is especially true when 
the signal of interest is embedded in noise. As an example, [2] describes an experiment 
where subjects listened repeatedly to a particular piece of choral music until they knew 
it very well. The music was then played back to them in the presence of white noise. The 
signal-to-noise ratio (SNR) was gradually decreased until there was only white noise with 
no music. All of the subjects thought they heard the music playing for a considerable 
time after the music had been completely turned off. Other physiological effects, such as 




masking, distortion, and auditory fatigue can also impair a human listener’s effectiveness 
as a classifier [3]. Masking, which is when a signal at one frequency seems to hide a signal 
at another frequency, is extremely important in considering what is heard because it can 
significantly reduce what one actually hears — the listener could become completely 
unaware of the fainter frequency components of a complex sound embedded in noise and 
thus may miss distinguishing characteristics vital to accurate classification [2]. In some 
cases, problems such as these can be partially overcome with sufficient training, so that 
the listener learns, through repetition, to discriminate certain signals in the presence of 
certain noises [3]. 

B. NOISE HINDERS SIGNAL SYNTHESIS 

One method to improve the effectiveness of classification (both for human listeners 
and artificial neural networks) is to expand the training data. When a sufficient quantity 
of real data is not available the training data can be expanded by including synthetically 
generated signals which have all of the significant characteristics of real signals. ARMA 
modeling techniques have been used both to model transient signals with high precision, 
and to develop models which are very much like the original, yet distinct. However, 
the synthesis process typically yields poor results when the signal is corrupted by even 
a small amount of ambient background noise [4]. The reason for this degradation is 
that ambient noise, being a broadband process, can only be accurately modeled with 
high model orders. Thus, in order to accurately model a narrowband process in the 
presence of wideband noise, excess poles are required [5]. Good models can normally be 
obtained in the presence of noise if the model order is chosen high enough to account for 
the broadband nature of the noise. The inclusion of excess poles however can produce 
undesirable distortions when synthesising new transients from real transient signals [6], 
The synthesis is usually bad enough that a human listener can immediately distinguish 
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between a real signal and a synthetically generated signal. Similarly, the distortions may 
introduce false clues to artificial neural networks trained with these synthetic signals, 
and thus cause poor classification performance in a real environment. 

C. IMPROVEMENTS THROUGH NOISE MODELING AND 
FILTERING 

Removing noise prior to modeling transients has been shown to greatly improve the 
ability to generate high quality synthetic training data [6]. Because ambient noise is 
always present in the ocean environment, once a high quality synthetic signal has been 
generated, realistic synthetic noise must be added in order to give the synthetic signal 
convincingly real characteristics. Not only must the noise sound real, but a variety of 
noise sources must be generated so that the classifier learns to classify the signal correctly 
in the presence of these different sources. The modeling of ocean noise is an extremelv 
important part of the overall process of sonar data modeling and synthesis, since the 
models are necessary to develop appropriate filters for removing noise from the signals of 
interest before modeling those signals, and the models provide ways to synthesize noise 
which then can be added to the synthesized signal data in a controlled manner. 

This thesis concentrates on two aspects of noise. The first part focuses on the 
sources of ambient ocean noise and how best to model the noise. Autoregressive (AR), 
moving-average (MA), and autoregressive moving-average (ARMA) models are compared 
in terms of simplicity of implementation and quality of reproduction. The comparisons 
include analysis of the noise spectrums and the model spectrums and subjective listening 
tests. This modeled noise can be added to synthetically generated transients to produce 
extremely realistic synthetic transient signals. 

The second part of this thesis is directed at developing techniques to remove the noise 
from digitally recorded signals. Since the signals embedded in the noise are non-stationary, 






a non-stationary approach is required. We chose a short-time application of the well 
known Wiener filter because the Wiener filter is an optimal filter for removing noise 
from a signal. One of the main difficulties in implementing the Wiener filter, however, 
is that design of this filter requires a knowledge of the signal autocorrelation function 
and the noise autocorrelation function, and generally assumes that the signal and noise 
are stationary. Since the respective autocorrelation functions are usually unknown, they 
must be estimated from the data, with the non-stationarity of the transient, signal taken 
into account. We applied a number of techniques to make the Wiener filter perform 
satisfactorily in a short-time application with real signals. These techniques include pre¬ 
whitening of the signals, filtering short (approximately stationary) signal segments, and 
overlap averaging of filtered segments. 

D. THESIS OUTLINE 

The remainder of this thesis is organized as follows. Chapter II discusses the nature 
of ambient noise and some current models. Results of tests for Gaussianity of the real 
noise samples examined in this thesis are also presented here. Chapter III describes 
three modeling techniques, AR, MA, and ARMA modeling, and compares them in the 
noise modeling application. Chapter IV introduces a noise removal technique which 
uses a novel implementation of the Wiener filter, and presents results. Chapter V lists 
conclusions and suggests possibilities for future research. 
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II. AMBIENT OCEAN NOISE 
CHARACTERIZATION 


Ambient ocean noise studies are typically concerned with being able to predict the 
noise level at certain bandwidths under various climatic and traffic conditions for the 
purpose of target detection or localization. This thesis analyzes noise from the perspec¬ 
tive of trying to recreate it in order to 1) develop methods for filtering the noise from 
real signals and 2) improve the quality of synthetically generated signals by adding syn¬ 
thetically generated noise to these signals. Proper characterization of the recorded noise 
data is essential for the development of the best statistical models. 

The definition of ambient noise varies with the application. Generally, it excludes 
all noise sources which are identifiable. One of these types of identifiable noise sources is 
self noise. This is any type of noise caused by the recording process, and includes noise 
from the recording platform such as machinery noise, electronic noise in the amplifier of 
the receiver, the noise caused by the flow of water over the hydrophone and its support 
structure, and related sources [7, 8]. Because we do not have control of the recording 
environment our definition is somewhat more broad; we define ambient noise to include 
all stochastic signals which interfere with the signal of interest, including some noise 
sources which might be considered identifiable. 

There are two reasons for adopting the above definition. First, all noise interferes 
with the reception of the transient signal of interest, and therefore should be filtered out 
of the signal in order to obtain the best reproduction of the original transient signal. 
Secondly, synthetic ambient noise will not adequately represent the noise found with the 
recorded data if it. does not include all sources of noise present in a real environment. 
For example, if a sonar operator only hears real signals in the presence of flow noise 



and machinery noise (two forms of self noise), then these types of noise must also be 
included in synthetically generated ambient noise to be added to a synthetically generated 
transient signal so that the synthetic signal will sound exactly like a real signal. 

A. SOURCES 

The broadband spectrum of ambient ocean noise shows different characteristics at 
different frequencies, indicating the presence of a variety of noise sources. Since there are 
several regions of the spectrum where one type of noise source dominates, the spectrum 
has been typically divided into five somewhat arbitrary frequency bands, each with differ¬ 
ent prevailing noise sources. The following sections briefly describe some of the primary 
sources of ambient ocean noise in each region. The descriptions below are summarized 
from [7, 8], each of which contains a more thorough description of the sources. 

1. Ultra-Low Band 

This band covers all frequencies less than 1 Hz. It is one of the least significant in 
terms of underwater sound studies, and is therefore the least studied and understood. The 
dominant sources in this region include tides, hydrostatic effects of waves, and seismic 
disturbances. Valid measurements in this frequency range are difficult because of the 
various sources of self noise caused by the interaction of the hydrophone and its supporting 
structure with ocean currents, as well as the pyroelectric effect on the hydrophone as tidal 
currents cause changes in the ocean temperature in the vicinity of the hydrophone. 

2. Infrasonic Band 

This band, which covers the frequencies from 1 Hz to 20 Hz, is of considerable 
interest for low frequency passive sonar applications because it contains the blade-rate 
fundamental frequency of propeller-driven vessels. Thus, one of the major noise sources in 

6 



this region is the noise from shipping. In the absence of shipping noise, which dominates 
all other sources, the noise spectrum in this band depends primarily on wind speed and 
the level of seismic activity. Self noise caused by the the flow of water over the hydrophone 
and its supporting structure is also evident in this frequency band. Another source in this 
region, which is not as significant, is oceanic turbulence in the form of irregular random 
water currents. 

3. Low Sonic Band 

This band covers the frequency decade from 20 Hz to 200 Hz, and is dominated 
by the noise from distant shipping, especially in the frequencies around 100 Hz. It 
includes sources as far away as 1000 miles. The noise from distant storms competes with 
the noise from shipping. In the absence of either of these noise sources, the ambient noise 
level is determined by the local wind speed. 

4. High Sonic Band 

This band covers the frequencies from 200 Hz to 50 kHz. The noise level in this 
band depends almost exclusively on the speed of the local wind, especially at frequencies 
above one kHz. The exact cause of the noise is still uncertain, but it is believed to be 
caused by a number of interactions of the wind with the water surface. These interac¬ 
tions include the breaking of whitecaps, flow noise of the wind over the water surface, 
the collapse of air bubbles (cavitation) caused by turbulent wave action, and the wave 
generating action of the wind. 

5. Ultrasonic Band 


Thermal noise, which is the noise of molecular bombardment, tends to domi¬ 
nate this band, which covers all frequencies above 50 kHz. This frequency band is not 



recorded digitally 


significant for our purposes since our signals are band limited and were 
with a sampling frequency much less than 50 kHz. 

6. Other Sources 

In addition to the noise sources which dominate each of the frequency regions 
described above there are intermittent sources which can cover more than one frequency 
band and be persistent enough at times to be considered part of the ambient noise. 
Biologic noise includes the sounds produced by shellfish, marine mammals, and fish. 
Non-biologic noise sources include rain, earthquakes, underwater explosions, volcanos, 
and surf effects. All of these sources are of only limited duration and are therefore not 
normally considered as part of the ambient background, even though at times they may 
dominate all other sources. [7] 

The ambient noise generated in the arctic has some unique characteristics be¬ 
cause of the ice cover, yielding a different noise environment than described above. In 
addition to the noise sources described above, ambient noise sources in this region include 
cracking ice, wind over the ice, moving ice masses colliding with each other, and waves 
impacting ice masses. [7] 

B. CHARACTERISTICS 

Along with the sources of the ambient noise, there are a number of other con¬ 
siderations which effect the qualities of ambient ocean noise. These characteristics are 
important to take into account in the development of good noise models. 

1. Stationarity 

Ambient ocean noise is not stationary because the individual noise sources which 
contribute to the total noise background are not stationary. For example, biologic activity 



is dependent on both the time of day and the season of the year. The noise from shipping 
similarly varies depending on the time, season, and location. Likewise, noise from the 
wind and storms is strongly dependent on the local atmospheric conditions. The vari¬ 
ability of the ambient noise background over various time intervals is discussed in detail 
in [7]. Although this variability is real and needs to be taken into account for predict¬ 
ing the noise level at a particular time of the day, we can reasonably assume that the 
noise is stationary over the short time intervals (several seconds) applicable to the meth¬ 
ods described in this thesis since none of the dominant ambient noise sources changes 
significantly over these short time intervals. 

2. Directionality 

The strength and spectral shape of the ambient ocean noise also depends on the 
direction of arrival. With the use of vertical arrays it has been found that shipping noise 
is highly directional in the horizontal direction, while noise from surface effects, such as 
wind and rain, are stronger from the vertical direction than from the horizontal direction 
[7], Horizontal arrays have similarly been used to show that the noise level in a horizontal 
plane is determined by such effects as the direction of the wind and swells, as well as the 
direction of individual ships which contribute to the noise field [7], In general we do not 
know the directional characteristics of the recording equipment used to collect the noise 
samples analyzed in this thesis, so the models we develop do not take the directionality 
of the noise into account. 

3. Depth Dependence 

Deep water noise has relatively well-defined levels and is predictable. Shallow 
water noise however is subject to much wider variations, both in time and location. In 
deep water the noise level at any given frequency is predominantly from one type of 



source, but in shallow water the noise at any frequency is a combination of shipping and 
industrial noise, wind noise, and noise from biologies. The combination of these three 
sources is variable and unpredictable, making accurate prediction extremely difficult. [8] 

4. Propagation Effects 

Sound transmission in the ocean is effected by the water temperature and pres¬ 
sure, with the sound rays bending toward regions of lower temperature and lower pressure. 
Sound can also reflect off of the sea surface and sea floor. These effects lead to various 
preferential transmission paths. One of these paths is the deep sound channel, in which 
low frequency sound travels horizontally with little attenuation. A receiver located in 
the deep sound channel will detect more noise than a receiver located below the channel. 
Similarly, noise from the sea surface can be trapped in the surface duct and travel nearly 
horizontally to distant locations. When the conditions are right, both the deep sound 
channel and the surface duct are likely to be noisy environments. The propagation of 
sound is also dependent on frequency, with high frequencies being attenuated much more 
than low frequencies. This causes the ocean to behave as a low pass filter. [7] 

C. STATISTICS 

The most important aspect of the noise from the standpoint of modeling and fil¬ 
tering is its statistical characteristics. Ambient ocean noise, which is comprised of many 
individual sources, is typically assumed to have a Gaussian distribution over short peri¬ 
ods of time [7]. This assumption is based on the Central Limit Theorem , which states 
that the probability distribution of a sum of a very large number of random variables ap¬ 
proaches a Gaussian distribution [9]. This assumption however is not valid in every case, 
particularly in the presence of dominant non-Gaussian noise sources (such as impulsive 
noise) [10, 11]. Testing the noise to see if truly it has a Gaussian amplitude distribution 
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is important in determining the most suitable techniques for modeling and filtering the 
noise. If the noise has a Gaussian distribution, then linear methods will yield optimal 
results [12]. For processes with non-Gaussian distributions, non-linear techniques are 
generally required in order to obtain optimal results. 

We tested both real and synthetically generated noise sources using a variety of 
statistical methods. The synthetic noise sequences were generated using Matlab pl| and 
were included to provide a “benchmark” with which to compare the results from the real 
data. The real data used was collected from a variety of underwater transient sources. In 
all cases the data sequences consisted only of background noise extracted before onset of 
the actual transient signal. The noise sequence labeled orca is from a recorded biologic (a 
killer whale). The tk-lOa noise is from ocean data distributed by DARPA to contractors 
as part of an early test of classification algorithms. The noise sequence bio-2385 is part 
of the NUSC TAP06 data. Finally, the sequences wxOl, qx03 , and fx09 are ambient noise 
from sounds recorded during range tests. 

1. x 2 Test for Gaussianity 

It would be extremely difficult to absolutely prove that a sequence has a Gaus¬ 
sian distribution because no matter how much evidence is collected there always exists 
the possibility that it does not. We can relatively easily prove that a sequence does not 
have a Gaussian distribution however. This is how the \' 2 test is used as a means of 
testing for Gaussianity. A null hypothesis (Ho) that the sequence has a Gaussian ampli¬ 
tude distribution is formulated. The alternate, and mutually exclusive, hypothesis i H-„) 
is that the sequence does not have a Gaussian amplitude distribution. We accept H 0 
as being true only if there is insufficient evidence to reject H\. The decision to accept 
or reject the null hypothesis at a certain level of significance is based on comparing the 
computed x 2 test statistic , a weighted summation of the difference between the observed 
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and expected probability distributions of the data, with a pre-determined critical valve. 
The null hypothesis is accepted (at the desired level of significance) if the test statistic 
is less than the critical value. [14, 15] 

To perform the test the data are first categorized into r amplitude cells. The 
number and size of the cells is flexible and typically depends on the application, but in 
general one should choose as many cells as possible in order to increase the number of 
degrees of freedom, with the condition that the expected number of occurrences in each 
cell should be at least five [16, 14]. Since the data will be distributed among the r cells 
according to the probability distribution function of the data, the number of degrees of 
freedom v is r — 1 (the number of occurrences in cell r is determined by the number 
of occurrences in the previous r — 1 cells). If the expected cell probabilities depend on 
unknown parameters which must be estimated from the data, then an additional degree of 
freedom must be subtracted for each unknown parameter [15]. Since a Gaussian process 
is completely characterized by its mean and variance, these are the only parameters which 
must be estimated from the data to test for a Gaussian fit. Thus, for all of our sequences 
the number of degrees of freedom is v = r — 1 — 2 = r - 3. 

The test statistic W is computed by forming a weighted sum of the squared 
differences between the actual number of occurrences in each cell and the number of 
occurrences in each cell which would be expected of a Gaussian sequence. This weighted 
sum is given by 

W = £ ( * i ~. ei)2 ( 2 . 1 ) 
where r is the number of amplitude cells, Ni is the actual number of occurrences in cell 
*, and e< is the observed number of occurrences in cell i. If the data sequence represents 
a multinomial experiment of N, independent trials, then as N, approaches infinity the 
distribution W approaches a \ 3 distribution with v degrees of freedom. [15] 
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In order to obtain a meaningful value for the test statistic we used, when pos¬ 
sible, N, = 20000 points of the noise sequence. This usually allowed us to form as many 
as 25 equally spaced amplitude cells and still maintain e< > 5 for all but one or two cells 
at each tail. The cells at either tail with e; < 5 were combined so that all cells had 
e> > 5 [15]. We chose a level of significance of 0.05, which represents the risk involved in 
wrongly deciding to reject the null hypothesis (the probability that the null hypothesis 
was incorrectly rejected). 

Table 2.1 shows the results of the y 2 test for a number of different synthetically 
generated noise sequences. In each case the number of degrees of freedom v was chosen 


Noise Source 

1 r 

v | W j Xo.os.v 1 Gaussian? | 

Gaussian 

20000 

23 

20 

18.82 

31.41 

yes | 

uniform 

20000 

25 

22 

4874.0 

33.92 

no 

12 uniforms 

20000 

23 

20 

12.70 

31.41 

yes | 

Laplacian 

20000 

15 

12 

3619.4 

21.03 

no 

exponential 

20000 

li 

8 

16629 

15.51 

no | 

Bernoulli-Gaussian 

20000 

21 

18 

338.7 

28.87 

no 


TABLE 2.1: RESULTS OF THE y 2 TEST FOR GAUSSIAMTY USING SYNTHETI¬ 
CALLY GENERATED NOISE SEQUENCES. 


as three less than r, the number of cells with e< > 5. The Gaussian and uniform white 
noise sequences were generated using the intrinsic Matlab functions “randn” and “rand.” 
respectively. The Laplacian, exponential, and Bernoulli-Gaussian (with event probability 
0.95) noise sources were generated using the “rpiid” function from the Matlab “Hi-Spec" 
toolbox [17]. Notice that only the Gaussian sequence and the sequence formed by sum¬ 
ming twelve independent realizations of 20000 points of uniform noise passed the \ 2 test 
for Gaussianity at the 0.05 significance level. 

The test was then performed on the real noise sequences described earlier. The 
results are listed in Table 2.2. With the exception of the sequence orca all of the recorded 




noise sequences passed the test well within the five percent significance level, providing 
strong evidence that these noise sequences have distributions which do not significantly 
depart from Gaussianity. The sequence orca is apparently not Gaussian, however its 
deviation from Gaussianity does not appear to be strong. 


Noise Source 

N, 

r 

« 

W 

Ao.os.v 

Gaussian? 

orca 

12000 

21 

18 

31.56 

28.87 

no 

tk-lOa 

20000 

23 

20 

17.12 

31.41 

yes 

bio-2385 

20000 

21 

18 

12.99 

28.87 

yes 

wxOl 

20000 

21 

18 

12.94 

28.87 

yes 

qxOS 

20000 

23 

20 

11.13 

31.41 

yes 

}x09 

20000 

21 

18 

10.20 

28.87 

yes 


TABLE 2.2: RESULTS OF THE x 2 TEST FOR GAUSSIANITY USING RECORDED 
OCEAN NOISE. 

The \ 2 test has some important limitations. One of these limitations is that 
the distribution of the discrepancies between the observed and expected frequencies is 
not taken into account, even though this distribution may influence how well the data 
appear to fit the theoretical curve. In other words, we would expect that the difference 
Ni — should alternate randomly between positive and negative values. If however, the 
differences are positive for all cells higher than the mean and negative for all cells lower 
than the mean, then the data would be less likely to have a Gaussian distribution, even 
though the \ 2 test statistic gives a satisfactory result. [18] 

Another important limitation of the x 2 test is that it considers only the marginal 
first and second order statistics (the mean and the variance) of the sequence, which is 
insufficient to prove joint. Gaussianity [19]. Other more powerful tests such as those 
discused in the next section are sensitive to joint properties of the data. 
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2. Gaussianity Tests Based on Higher-Order Statistics 

Tests based on higher order statistics (HOS) can yield additional insight since 
they are not necessarily subject to the same limitations as the y 2 test. Classical tests 
are based on simple calculations of the marginal third and fourth order moments , while 
modern tests are based on the third and fourth order cumulants. These modern tests yield 
more information, but typically require significantly more calculations than classical tests, 
and have thus only lately gained increased attention because of the recent improvements 
in the computational capabilities of modern computers. 

a. Classical Tests 

Two classical tests are the measures of the coefficient of skewness and the 
coefficient of kurtosis of a random sequence. The coefficient of skewness is a measure of 
the symmetry of the density function about the mean. A Gaussian density function, and 
all ot her symmetric density functions, have zero skewness. For this reason the coefficient 
of skewness by itself is not a valid test for Gaussianity of a data sequence. Its usefulness is 
in showing that a density function is not Gaussian since non-zero skewness is incompatible 
with Gaussianity. The coefficient of kurtosis, which is a measure of peakedness of the 
density function, can provide additional insight. Its value is zero for a Gaussian density 
function. A density function with a sharper peak (such as the Laplacian density) has a 
positive coefficient of kurtosis and is called leptokurtic , while a density function with a 
negative coefficient of kurtosis has a flatter peak and is called platykurtic. [15] 

The coefficients of skewness and kurtosis rely on the calculations of the 
marginal third and fourth order central moments about the mean of the sequence (Since 
only the marginal statistics are considered, as with like the x 2 test, these tests are 



insufficient to prove joint Gaussianity.) The k th central moment about the mean of a 
random variable X is defined as 


fi k = €[{X - m x )‘] (2.2) 

where mx is the mean of the random variable and £[•] represents the expectation oper¬ 
ation. These statistical parameters are generally unknown, so they must be estimated 
from the data. The sample mean is estimated by 

fax = jj- £ x («) (2-3) 

while the sample variance of the data, which is the second central moment about the 
mean, is estimated by 

°x = £ [*(") - fax? (2- 4 ) 

from which the sample standard deviation ax can be obtained. Using (2.3), the sample 
third and fourth central moments about the mean are then estimated by 

^3 = Y E [*(") “ (2.5) 

£4 = Jj- E M n ) - fa*? ( 2 - 6 ) 

Finally, using (2.3), (2.4), (2.5), and (2.6), the coefficient of skewness 03 and coefficient 
of kurtosis o 4 are estimated by [15] 

63 

a* 

The test yields qualitative rather than quantitative results since no test statis¬ 
tic is computed as with the \ 2 test. A sequence can be considered non-Gaussian if either 
quantity is not close to zero. Table 2.3 lists the results of these tests for the same test 
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(2.7) 

( 2 . 8 ) 



Noise Source 

n 

skewness (q 3 ) 

kurtosis (q 4 ) 

Gaussian 

20000 

-0.0024 

0.0122 

uniform 

20000 

0.0024 

1.8002 

12 uniforms 

20000 

-0.0020 

-0.0149 

Laplacian 

20000 

0.0118 

2.9975 

exponential 

20000 

1.9386 

5.3594 

Bernoulli-Gaussian 

20000 

0.0155 

0.1912 


TABLE 2.3: RESULTS OF THE TESTS OF SKEWNESS AND KURTOSIS USING 
SYNTHETICALLY GENERATED NOISE SEQUENCES. 

noise sequences as those listed in Table 2.1. All of the sequences except the exponen¬ 
tial sequence have small coefficients of skewness, indicating that only the exponential 
sequence fails the test for Gaussianity based on symmetry properties. The other non- 
Gaussian sequences (uniform, Laplacian, and Bernoulli-Gaussian) passed the test of zero 
skewness but failed the test of Gaussianity based on their respective coefficients of kur- 
tosis, illustrating the importance of performing both tests. Only the Gaussian and the 
twelve uniform sequences added together passed the test for Gaussianity based on both 
the coefficient of skewness and the coefficient of kurtosis. 

Table 2.4 lists the results for the real noise sequences. All of these sequences 
except orca have coefficients of skewness and kurtosis close to zero, giving strong evidence 
that the sequences may be Gaussian. The sequence orca is possibly nori-Gaussian since 
its coefficient of kurtosis is relatively large. This sequence is the only one that also did 
not pass the \ J test at the 0.05 significance level. 

b. Tests Based on Higher Order Cumulants 

Characterization of random processes using higher-order moments is a field 
which has recently been given much attention. Cumulants , which are related to moments, 
can be used for characterizing random processes, and are particularly useful as a measure 





| Noise Source [ n 

skewness (dg) 

kurtosis (d 4 ) | 

area 

12000 1 

0.0055 

0.1017 

tk-lOa 

20000 

0.0042 

-0.0504 

bio-2385 

20000 

0.0044 

-0.0274 

wxOl 

20000 

-0.0028 

-0.0333 

qx03 

20000 

-0.0007 

-0.0069 

fx09 

20000 

-0.0111 

0.0507 


TABLE 2.4: RESULTS OF THE TESTS OF SKEWNESS AND KURTOSIS USING 
RECORDED OCEAN NOISE. 

of departure from Gaussianity since all cumulants of order higher than two are identically 
zero for a Gaussian random process. If x is a real zero-mean random process the first 

four cumulants are 

= £[x(n)]=m, = 0 (2.9) 

Ci 2) (h) = f[r(n)i(n + li)] = i?,(I,) (2.10) 

Ci 3) (h.l 2 ) = E[x(n)x(n + h)x(n + k)} (2.11) 

Ci 4) (h.hJz) = SlxinMn + hWn + hjxin + kfi-CWlhKfHk-h) (2.12) 

- C^Xl^C^Xh - h) - Ci 2) (h)Ci 2) (h - h) 

where the arguments and I3 are referred to as “lag values.” Note that if the se¬ 
quence x has unit variance then for li,hJ3 = 0 (2.11) and (2.12) reduce to (2.7) and 
(2.8). respectfully. Since the third order cumulant is related to the skewness of the den¬ 
sity function, tests based only on the third order cumulant are insufficient to completely 
determine Gaussianity for the same reason that the coefficient of skewness is an inade¬ 
quate test when performed by itself. The fourth order cumulant, like the coefficient of 
kurtosis, should also be considered when testing for Gaussianity. Cumulants of order 
higher than four are not typically used since they are difficult to compute reliably and 
thus have little practical application. [12] 
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The power spectrum of a sequence can be obtained as the one-dimensional 
Fourier transform of the second order cumulant (which is identical to the covariance for 
a zero mean process). Higher-order spectra, or polyspectra, can be similarly obtained as 
multidimensional Fourier transforms of higher-order cumulants. In particular, the two- 
dimensional Fourier transform of the third-order cumulant is called the bispectrum , and 
is expressed as 

B m (ui,u, 2 )= £ £ (2.13) 

For a real random process the bispectrum has twelve regions of symmetry, so the bispec¬ 
trum needs to be computed only over the triangular non-redundant region [20] 

ft = {0 < ui < t. w 2 < wi, 2u>i + lot < 2ff} (2.14) 

The rest of the bispectral plane is related to points in this region through symmetry 
properties and thus contains no new information. [12] 

The normalized bispectrum (or bicoherence ) is 



where 5,(w,-) is the power spectrum of the sequence x at frequency w*. Using (2.15) the 

average skewness of the sequence is 

63 = fj BW{wi,w 2 ) du t dw 2 (2.16) 

where the integration is performed over the non-redundant region defined by (2.14). This 
relationship is the basis for using the estimated bicoherence of the sequence to compute 
a test statistic which gives the probability of falsely rejecting the hypothesis of zero 
skewness. [20] 

The Matlab function “GLSTAT” from the “Hi-Spec” toolbox was used to 
perform this test. “GLSTAT” first estimates the bispectrum using a direct Fourier 



transform approach, from which the bicoherence is estimated. The test statistic, which 
is the mean bicoherence power, is computed as 

(2.17) 

where the summation is performed over the non-redundant region defined by (2.14). The 
test statistic S has a x 3 distribution with the number of degrees of freedom determined 
by the length of the discrete Fourier transform (DFT) length and and the resolution 
parameter. This test is described in detail in [20] and [17]. 

Table 2.5 lists the results from testing the same noise sequences listed in 
Table 2.1 and Table 2.3. N, is the number of points in the data sequence, 5 is the test 


| Noise Source | N, \ 

S I Pfa | 

Symmetric? 

Gaussian 

20000 

0.7444 

1.0000 

yes 

uniform 

20000 

0.4885 

1.0000 

yes 

12 uniforms 

20000 

0.5393 

1.0000 

yes 

Laplacian 

20000 

0.8936 

1.0000 

yes 

exponential 

20000 

167.80 

0.0000 

no 

Bernoulli-Gaussian 

20000 

| 0.7281 

1.0000 

yes 


TABLE 2.5: RESULTS OF HINNICH’S TEST FOR GAUSSIANITY USING SYN¬ 
THETICALLY GENERATED NOISE SEQUENCES. 

statistic, and Pfa is the probability of incorrectly accepting the non-zero skewness hy¬ 
pothesis. The DFT length for all sequences was 256 points and the resolution parameter 
was chosen as 0.51 (the default value used by the “GLSTAT" function), which resulted 
in 48 degrees of freedom for all data sequences. If the Pfa is high there is strong evidence 
that the non-zero skewness hypothesis can be rejected in favor of the zero skewness hy¬ 
pothesis. If the Pfa is lower than about 0.05 the probability is high that the data have 
non-zero skewness, and are therefore non-Gaussian. All of the noise sequences, with the 
exception of the exponential sequence, had high values of Pfa , indicating that the zero 
skewness hypothesis can not easily be rejected. 
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Table 2.6 lists the results from testing the same noise sequences listed in 
Table 2.2 and Table 2.4. All of the real noise sequences passed the test. We noticed 


Noise Source | N, 

s 

Pfa 

Symmetric? | 

orca 

12000 

21.750 

0.9996 

yes 

tk-lOa 

20000 

1.0470 

1.0000 

yes 

bio-2385 

20000 

52.970 

0.2882 

yes 

wxOl 

20000 

0.5171 

1.0000 

yes 

qx03 

20000 

0.5502 

1.0000 

yes 

fsO'l 

20000 

0.9484 

1.0000 

yes 


TABLE 2.6: RESULTS OF HINNICH’S TEST FOR GAUSSIANITY USING 
RECORDED OCEAN NOISE. 

that the DFT length can significantly affect the values of both the test statistic S' and 
the Pfa for a correlated noise sequence, indicating the importance of not rejecting the 
zero-skewness hypothesis based on performing the test using only one DFT length. For 
example, we found that a colored Gaussian noise sequence failed the above test using a 
DFT length of 256, but passed the test using a DFT length of 255 or 257. Similarly 1 ,, 
the Pfa for sequence bio-2385 was only 0.2882 with a DFT length of 256 but was 0.9986 
with a DFT length of 257. Although we do not have a detailed explanation, we assume 
that the variation is due to some anomaly (such as periodicity) introduced into the data 
at or near a specific DFT length that does not otherwise appear. 

Because of the computational complexity involved with estimating the fourth- 
order cumulant, efficient quantitative statistical tests based on the fourth order cumulant 
have not yet been developed. Therefore a qualitative test was performed in the same way 
that the qualitative tests were performed using the coefficients of skewness and kurtosis. 
Since the fourth order cumulant is cumbersome to estimate for all lag values one can 
obtain a “sample’" of it by estimating its value on a one-dimensional slice (i.e.. a line 
through the space of lag values). Although any one-dimensional slice can be used, we 
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chose the diagonal slice, which is equivalent to setting /j = l 2 = 1 3 in (2.12). The maxi¬ 
mum value of this one-dimensional slice is compared with the theoretical value of zero. 
If this maximum value significantly deviates from zero then there is a strong possibility 
that the sequence is not Gaussian. 

The Matlab function “CUMEST” from the “Hi-Spec” toolbox was used to 
estimate the diagonal slice of the fourth order cumulant, C^’ d \ using the “overlapped- 
segment method” described in [17]. The biased estimate was computed using a segment 
length of 1000 points with 50 percent overlap, and lag values ranging from l — —100 to 
l = +100. The results are listed in Tables 2.7 and 2.8. 


| Noise Source 

N, 

max(C'I 4, ‘0) 

Gaussian 

20000 

0.0534 

Uniform 

20000 

-1.2011 

12 uniforms 

20000 

-0.0491 

Laplacian 

20000 

3.1137 

exponential 

20000 

6.2762 

Bernoulli-Gaussian 

20000 

0.1939 


TABLE 2.7: RESULTS OF THE TEST FOR GAUSSIANITY USING THE DIAGO¬ 
NAL SLICE OF THE FOURTH ORDER CUMULANT OF SYNTHETICALLY GEN¬ 
ERATED NOISE SEQUENCES. 


The results listed in Table 2.7 are similar to the results from testing for the 
coefficient of kurtosis listed in Table 2.3. Note that only the Gaussian noise sequence and 
the sequence formed by summing 12 independent uniform sequences yielded low values 
for the maximum value of The other synthetic noise sequences had high values 

for . indicating that they have a low probability of being Gaussian. 

The results listed in Table 2.8 are likewise similar to the results from testing 
for the coefficient of kurtosis listed in Table 2.4. All of the real sequences, with the 
exception of orca, yielded small values of Cl i,d \ indicating that these sequences do not 
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| Noise Source | N. \ 

max(CT 1 ’' 1) ) 

orca 

20000 

| 0.1242 | 

tk-lOa 

20000 

-0.0603 

bio-2385 

20000 

-0.0705 

wxOl 

20000 

0.0543 

qxOS 

20000 

0.0607 

fx09 

20000 

0.0407 


TABLE 2.8: RESULTS OF THE TEST FOR GAUSSIANITY USING THE DIAGONAL 
SLICE OF THE FOURTH ORDER CUMULANT OF RECORDED OCEAN NOISE. 

significantly deviate from Gaussianity. The relatively large maximum value of CM 
for orca indicates that this sequence is possibly non-Gaussian. This departure from 
Gaussianity however is not strong since the maximum value of for orca is not 

nearly as large as that obtained for the known non-Gaussian noise sequences. 

3. Statistical Conclusion 

The results of the statistical tests indicate that all of the recorded noise se¬ 
quences we considered in this thesis can reasonably be assumed to have a Gaussian 
distribution. Most of the noise sequences responded to the test with strong evidence of 
Gaussianity. Only the orca noise sequence gave evidence of non-Gaussianity and the evi¬ 
dence in most tests was not strong. These results imply that linear Gaussian techniques 
are appropriate vehicles to obtain models and filters for these noise sequences. This is 
fortunate, since very few real processes if non-Gaussian are linear [21] and methods for 
modeling of non-linear and non-Gaussian random processes are only in the early stages 
of development. 
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III. MODELING 


The process of signal modeling involves designing a filter to produce a sequence with 
characteristics which are identical to a desired random process. In our case the desired 
random process is ambient ocean noise. Because the recorded ocean noise tends to have a 
Gaussian distribution, only the second-order statistics of the original sequence need to be 
matched (a Gaussian process is completely defined by its second-order statistics). Since 
a linear transformation of a Gaussian random process preserves its Gaussian nature, an 
appropriate signal model is a linear model driven by Gaussian white noise. Additionally, 
since the recorded noise also tends to be stationary over the time intervals of interest, 
a time-invariant model can be used. Finally, since we desire stable models with stable 
inverses for the application described in Chapter IV, the models we shall focus on will 
be minimum phase linear models. 

A. TYPES OF LINEAR MODELS 

Three basic linear time-invariant models are considered: the autoregressive (AR), 
the moving average (MA), and the autoregressive moving average (ARMA) models. These 
models are illustrated in Figure 3.1, where w(n) is a Gaussian white noise sequence of 
unit variance, x(n) is the modeled sequence, and A(z) and B(z) are the polynomials in 
the denominator and numerator, respectively, of the filter transfer function defined by 

A(z) *= do + a\Z 1 + a 2 z 2 + • • • + apz p (3.1) 

B(z) = b 0 + hz - 1 + b 2 z- 2 + ■ • ■ + bgz-* (3.2) 
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Figure 3.1: Types of Linear Models, (a) AR. (b) MA. (c) ARMA 

All three models can be described in the time domain by the difference equation for the 
general ARMA model, 

x(n ) + a\x(n — 1 )-(-••• 4- apx(n — /’)== haw(n) + biiefo — 1) + • • • + bqw(n — Q) (3.3) 

The AR. model can be obtained from (3.3) by setting P ^ 0 and Q = 0: the MA model 
can be obtained by setting P = 0 and Q £ 0. [12] 

Because the AR model can have poles near the unit circle, the AR model is best 
for representing sequences with peaks in the power spectrum. Correspondingly, since the 
MA model can be designed with zeros near the unit circle, the MA model is best for 
representing sequences with nulls in the power spectrum. The ARMA model works well 
for sequences with both peaks and nulls in the power spectrum, as well as other general 
spectra without pronounced peaks and nulls. The models are somewhat interchangeable 
in that an AR model can be used to estimate an ARMA or MA process if the model order 
is chosen sufficiently high and a MA model can be used to estimate an AR or ARMA 
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process with a high enough model order. The best results are obtained however when 
the model exactly matches the process which generated the sequence. [12] 

1. The AR Model 

The simplest model to solve is the AR or all-pole model because its solution 
depends entirely on a system of linear equations. We chose to use Burg's method [12] to 
find the afs of (3.1) because it ensures a minimum phase solution (all roots of A(z) inside 
the unit circle) and so guarantees that both the model and its inverse will be stable. 

Since Burg’s method is based on a least squares procedure, better results are 
obtained if the noise sequences to be modeled are very long. Using 10,000 to 20,000 points 
gives the best results without taking too much time to compute. Using more than 20,000 
points serves only to increase the computation time without providing better results. 
Using less than a few thousand points does not seem to capture enough of the noise 
characteristics to obtain the best models. 

2. The MA Model 

The MA. or all-zero, model, even though it seems to have the simplest form, 
is in general difficult to solve because its solution depends on a system of nonlinear 
equations. If the sequence to be modeled was generated by an MA process, then the 
transfer function polynomial can be obtained by spectral factorization of the spectral 
density function of the original process. The spectral density function S m (z) is found be¬ 
taking the z-transform of the autocorrelation function of the random process. Since for 
a white noise input S m (z) is of the form 

S.(z) = B(z)B'(im (3-4) 

B{z) can be solved for by finding the roots of S x (z). Multiple solutions are generally 
possible because the roots of S m (z) occur in complex reciprocal locations, giving rise to 


26 




minimum phase, maximum phase, and possibly mixed phase solutions for B(z). If the 
signal to be modeled is deterministic or non-Gaussian it is important to choose the roots 
with the correct phase in order to obtain an accurate model. If the signal to be modeled 
is a Gaussian random process, then the maximum phase, mixed phase, or minimum 
phase solutions are all equally valid since each of these models yields identical second 
order statistics. However since subsequent processing requires the inverse of the filter, 
we chose the minimum phase solution of B(z) by selecting the roots of S m (z) which lie 
inside the unit circle. [12] 

The MA model turned out to be ineffective for our purpose for two reasons. 
First, models of low order generally do not yield well modeled noise. Higher order models 
can improve the quality of the modeled noise, but these higher model orders usually have 
zeros of B{z) close to or on the unit circle, which causes stability problems for the 
inverse model. Secondly, the spectral factorization is subject to errors. The factorization 
algorithm does not always yield the same number of minimum and maximum phase 
roots, and poles on the unit circle rarely occur in even multiplicities. These errors may 
be attributed to roundoff error in factoring S*(z) and the error associated with estimating 
the correlation function of the original noise sequence. 

3. The ARMA Model 

The ARMA. or pole-zero, model is the most flexible of the models and typically 
requires the fewest number of model parameters. Optimal methods for estimating the 
ARMA model parameters involve solving for the AR and MA parameters simultaneously. 
Unfortunately, optimal methods (usually iterative and based on maximum likelihood or 
related concepts) are not well developed because they require the solution of a system of 
highly nonlinear equations, and may not converge to a correct solution. The preferable 
methods are suboptimal , and usually involve solving for the AR and MA parameters 
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separately. The advantage to these techniques is the significantly reduced computational 
complexity. If the original random process was truly generated by an ARMA process the 
model can be obtained by finding the AR parameters first and then filtering the original 
sequence by the filter formed from the AR parameters, yielding a residual sequence which 
is a pure MA process. The MA parameters can then be found by spectral factorization 
of the residual sequence. [22, 23] 

The main advantage of ARMA modeling for deterministic signals is that it 
generally yields the most parsimonious representation of the original sequence [12]. This 
is of tremendous usefulness in applications where data compression or simplicity are 
important. Consequently, ARMA models are favored for modeling transient signals. 
(For example, a 2000 point transient signal could be modeled by segmenting the data 
into, say. forty fifty-point segments, and then modeling each segment with perhaps a 
{P,Q) = (6,5) ARMA model. The modeled transient then has only 440 parameters 
instead of the original 2000.) For noise modeling, however, a parsimonious representation 
is not necessarily advantageous because any sequence of stationary noise, regardless of 
length, requires only one model, and therefore only one set of calculations. (One is not 
normally interested in purposefully transmitting compressed noise.) 

For modeling noise the ARMA model does not give better results than the AR 
model because it suffers from the same pitfalls as the MA model: spectral factorization 
errors and inverse filters with poor stability characteristics. Additionally, any reduction 
in parameters in the ARMA over the AR model for noise is usually offset by the increased 
computational complexity of finding the ARMA model parameters. 

B. MODEL ORDER SELECTION 

Choosing the correct model order for stochastic data is not a trivial problem. We 
analyzed four methods for determining the best model order for our data. Theoretically, 
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if the data can be described by a finite order AR model and if the model order is chosen 
correctly, then the theoretical prediction error variance will remain constant for all model 
orders higher than the theoretical model order. Real data is seldom generated by an exact 
AR process, so it may be difficult to find the model order at which the prediction error 
variance becomes constant. This section examines a number of methods to approximate 
the best model order. 

1. Theoretical Criteria 

The four most common methods for chosing the model order seek to minimize 
a quantity related to the prediction error variance of and the number of data points N,. 
These quantities, which each have a distinct minimum at the optimal model order, are 
Akaike’s information-theoretic criteria (AIC), Parzen’s criterion autoregressive transfer 
(CAT). Akaike’s final prediction error (FPE), and Schwartz and Rissanen’s minimum 
description length (MDL). The formulas for these quantities are 


AIC(P) = 

N, In a\ p + 2P 

(3.5) 

CAT(P) = 

N.ol, ) N.a\ p 

(3.6) 

FPE(P) = 

, (N. + P + 1\ 


ff ' F (n.-p- l) 

(3.7) 

MDL(P) = 

N, In a\ p + Pin N, 

(3.8) 


Figure 3.2 demonstrates the use of one of these criteria, the AIC, which does indeed have 
a minimum at P = 17. Unfortunately, since these quantities work well only if the data 
are generated by an AR process, their use can be severely limited for real data. [12] 
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Model Order 


Figure 3.2: Example of Model Order Selection Based on the AIC Criteria for Sequence 
wx02 Using 2000 Points of Noise Data . 

One major drawback of these criteria for modeling stochastic data is their de¬ 
pendence on the data length. Long sequences of data (on the order of 10,000 points) 
typically indicate very high filter orders, while short data sequences (on the order of 1000 
points) typically yield much lower filter orders. For example, when the data sequence 
u txO‘2 was tested using 2000 points of noise, a 17 th order model was indicated, while using 
20,000 points of noise indicated that a 98 ih order model should be used. The dilemma 
is that the accuracy of least-squares techniques, such as Burg's method, improve with 
increasing data length, thereby yielding better model parameters. The theoretical model 
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orders indicated by the above methods can indicate erroneously high model orders how¬ 
ever. and are therefore only of limited usefulness in modeling stochastic data which are 
not generated by a truly AR process. 

2. Computation of Prediction Error 

In this analysis the prediction error for linear predictive filtering was computed 
and tested. For a well chosen model order the output of the prediction error filter pictured 
in Figure 3.3 will be Gaussian white noise of unit variance. The filter transfer function 



Figure 3.3: Prediction Error Filter. 


is given by (3.1) where the a< parameters are the same as those in (3.3). For each model 
order selected, the original data was filtered with the prediction error filter and the 
spectrum of the resulting error signal was analyzed. The model order was chosen as the 
lowest order which gave a reasonably fiat spectrum for the error signal. Figure 3.4 shows 
an example of using the spectrum of the PEF output. In this example an 11 th order AR 
model was used to model 20,000 points of noise from the sequence wx02. Notice that 
the variation for the PEF output spectrum is slightly greater than the variation of the 
spectrum for the white noise. The variance could be reduced by a slightly higher order 
model but other tests described below give some rationale for retaining the 11 th order 
AR model. 

3. Spectral Density Comparison 

In this test, the AR parameters for the noise are obtained using Burg’s method 
as before. The model in Figure (3.1 )(a) is then driven with Gaussian white noise of 
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Comparison of PEF Output Spectrum and White Noise Spectrum 



Figure 3.4: Example of Model Order Selection Based on the Flatness of the Prediction 
Error Filter Output Spectrum for Sequence wx02 Using 20,000 Points of Noise Data. 


unit variance. The spectrum of the noise is then compared with the spectrum of the 
model to see if they are reasonably close. Figure 3.5 shows the results of comparing the 
spectrums of the noise and the model using the same example as that used in Figure 3.4. 
This comparison shows that an 11 th order model performs reasonably well for this noise 
sequence, although a higher order model could possibly match the noise better. 
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Comparison ot Power Spedral Density Estimates 



Normalized Frequency 


Figure 3.5: Example of Model Order Selection Based on Comparison of the Noise Spec¬ 
trum and the Model Spectrum for Sequence wx02 Using 20,000 Points of Noise Data. 

4. Aural Evaluation 

As a further means of testing, we can generate models of various orders and 
determine the lowest order model which gives the best sounding results. This test is 
performed by listening to the real noise and the modeled noise played back-to-back. The 
model is rejected if it is possible to hear the transition point between the noise and 
the model. Modeled noise which sounds indistinguishable from real noise can be gener¬ 
ated using model orders typically lower than those predicted using either the theoretical 
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criteria or any of the other tests. The listening tests generally agree with the results 
obtained by comparing the noise and model spectrums, in that when it is possible to 
hear the difference between the noise and model then the spectra are also noticeably 
different. An ll' h order model was chosen for the above examples because this was the 
lowest model order for which it was not possible to hear the distinction between the real 
noise and the modeled noise. This demonstrates that aural evaluation is effective if a 
good sounding model is desired, but may not necessarily be the best if the most accurate 
model is required. The listening test and the prediction error filter output test turned 
out to give the most useful practical results. 



IV. FILTERING AND RESULTS 


An observed noisy transient- signal can be thought of as a random process ®(n) 
which is related to another random process s(n) which cannot be observed directly. If 
x(n) is the transient with additive background noise, then s(n) is an original transient 
signal free of noise. The goal of filtering is to estimate s(n) from x(n) for all n. The 
optimal filter which performs this estimation by minimizing the mean square error is 
known as the Wiener Filter. Its design is based on the statistical characteristics of the 
signal and the noise. If the noise is Gaussian, then the Wiener filter is the best filter (in 
the mean-square sense) for removing the noise. If the noise is not Gaussian, then the 
Wiener filter is the best linear filter for removing the noise although in general there may 
be a nonlinear filter which will perform better. [12] 

The Wiener filter is usually designed for filtering stationary signals, which is a 
reasonable assumption for the noise but a poor assumption for the transient signal whose 
statistical characteristics change significantly over a short time. Accordingly we have 
extended the Wiener filter by considering a short-time approach which is effective for 
removing the noise from non-stationary signals, such as underwater transient signals. 

A. WIENER FILTERING OF STATIONARY RANDOM SIG¬ 
NALS 

A signal in noise can be represented as 

x(n) = s(n) + Tt(n) (4.1) 

where x(n) is the received signal, s(n) is the signal without noise, and y{n) is the noise. 
It is assumed here that all quantities are real and have zero mean. When the signal 
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and noise are uncorrelated, as is typically the case for observed transient signals, the 
autocorrelation of the noisy signal is given by 

R*(l) = R.U) + Rn(l) (4.2) 

where RAJ) is the autocorrelation of the signal plus noise, R.(l) is the autocorrelation 
of the signal, and R^J) is the autocorrelation of the noise. If a properly designed finite 
impulse response (FIR) filter is applied to the data then an estimate of the signal can be 
obtained from the convolution expression 

s(n) = X>(0*(n-D (4.3) 

(=o 

where h(n) is the impulse response and P is the order of the optimal filter. [12] 

The Wiener filter, that minimizes the mean-square error, satisfies the Wiener-Hopf 
equation 

pR x (l-i)h(l) = R„(iy, t = 0,1,..., P — 1 ' (4.4) 

where, since the signal and the noise are uncorrelated, the cross-correlation function 
between s and x is given by 

= R.(l) (4.5) 

Using (4.5) in (4.4) and writing each of the P equations explicitly yields the matrix form 
of (4.4) as 


' RAO) 

RA- 1) • 

• • r,(i-p) i r h( o) i 

r *.(o) i 


R*( i) 

fl.(0) • 

•• R x (2-P) h( 1) 

RAD 

(4.6) 



: : = : 

. RAP- i] 

RAP-2) • 

•• RAO) JU(F-i)J 

[ RAP- i) J 



from which the filter weights h(l) can be solved by simple matrix algebra. 

B. SHORT-TIME APPROACH TO WIENER FILTERING 

The Wiener filter described above works well for stationary signals, but several mod¬ 
ifications are needed for non-stationary signals. In the short-time approach to Wiener 
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filtering the signal of interest is segmented into short-time stationary segments, where 
it is assumed that signal changes relatively slowly over small intervals. What consti¬ 
tutes short-time stationarity is largely subjective and depends on the signal. Speech, 
for example, is normally considered stationary over periods of 20 to 30 msec [24]. More 
slowly varying transients can be considered stationary over intervals of perhaps tenths 
of a second, while more quickly changing signals, such as impulsive signals, may only 
appear stationary over intervals of several microseconds. 

1. Autocorrelation Estimates 

In the short-time approach the autocorrelation function of the data is estimated 
for each segment, the filter weights are calculated for each segment, and finally each 
segment is filtered with the appropriate filter. Since the exact autocorrelation functions 
necessary to construct (4.6) are unknown, they must be estimated from the data. The 
biased estimate 

1 -V. - 1 

R x (1) = y E *(n + 0*(»»); 0 <1<N, (4.7) 

is used because it guarantees that the autocorrelation matrix in (4.6) will be positive 
definite and therefore nonsingular. Since the correlation function is symmetric. (4.7) 
needs only to be computed at zero and positive lag values. This estimator is asymptoti¬ 
cally unbiased and consistent, which means that the estimate converges in probability to 
the actual correlation function as the number of samples tends to infinity. This implies 
that a compromise must be made in finding the best filter since stationarity requires the 
shortest possible segmentation while a good estimate of the autocorrelation requires the 
longest possible segments. [12] 

Equation 4.6 requires the autocorrelation functions of both the noisy signal x(n ) 
and the noise-free signal s{n). To estimate these correlation functions Rr,(l) can first be 
estimated by applying (4.7) to segments of the signal which contain only noise. Similarly, 
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R x (l) can be estimated for each segment containing the signal plus noise. R,(l) cannot 
be estimated directly, but it can be obtained by from (4.2) as 

R.(D = Ml) - Ml) (4.8) 

2. Pre-Whitening 

Since the estimates for R,{1) obtained by (4.8) can have some error, the filter 
for some segments of the signal can give poor results. Note that the estimate for R.(l) 
is not guaranteed to be positive definite even if the estimates for both R*(l) and R v (l) 
are positive definite. The estimate of R,(l) can be improved in practice by pre-whitening 
the signal because it reduces the problem of estimating the correlation function for the 
noise to that of estimating just a single parameter (the noise variance). Pre-whitening is 
accomplished by obtaining a good AR model for the noise using the method described in 
Chapter III. and then filtering the noisy data x(n) with the prediction error filter (PEF) 
formed from the inverse model, as shown in Figure 4.1. If the AR model order was well 



jM. 



bo J 



Figure 4.1: Prediction Error Filter. 


chosen, then the output of the PEF x'(n ) will be white noise for noise only input. For 
sections of the original signal containing signal plus noise the output of the PEF will be 
a colored version of the original signal plus white noise, that is 

x’{n) = s'(n) + r/'(n) (4.9) 

where r]'(n) is Gaussian white noise. If the short-time filtering approach is applied to the 
output of the PEF rather than to the original signal, then R^(l) can be assumed to be 




the autocorrelation function of Gaussian white noise, which is simply an impulse with 
value equal to the estimated variance (ideally the noise variance will be 1.0). R',(l) can 
then be estimated by simply subtracting this impulse from R a (l). The advantage in this 
approach is that only the zero lag is affected by the subtraction whereas in the previous 
method all lag values were affected by the subtraction of Rx(l) and R^(l). This new 
approach tends to reduce the effect of accumulated errors. For the sections of the signal 
containing only noise, ideally R x (l) = R^{1) since R',(l) = 0. This should yield h(n) = 0 

for n = 0,1__ P — 1, which means that the noise (which is the entire received signal) 

should be entirely removed from segments containing only noise. 

it is important that not too much be subtracted from fi,(0) because this can 
cause the correlation function R,i(l) to become indefinite and produce significant distor¬ 
tion. We chose to filter several segments of noise through the PEF and use the smallest 
of the resulting variances as the value to subtract from /?*(0). This conservative estimate 
decreases the chance of subtracting too much and lessens the chance that R,{1) will be¬ 
come indefinite. In practice it was found that subtracting too little gives better results 
than subtracting too much. 

Once the filter has been determined, the short-time Wiener filter is applied to 
the colored signal x'(n) in order to produce an estimate of the clean colored signal s'(n). 
Then s'(n) is filtered with the AR model in order to obtain an estimate s(n) of the 
original signal. Figure 4.2 illustrates this process. 


J *(*) 

r'(n) 

Wiener 

*(«) 

bo 

! 


Filter 

1 a(2) ! 


Figure 4.2: Pre-Whitening and the Short-Time Wiener Filter. 
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3. Smoothing Discontinuities 


After all of the signal segments have been filtered the complete signal is con¬ 
structed by joining all the segments into one signal. Since each segment is filtered with 
a slightly different filter, some smoothing techniques must be applied so that the differ¬ 
ences at the segment boundaries do not become noticeable. These differences produce a 
distortion that is mainly from the noise that remains in the signal after filtering, and is 
most noticeable in the segments containing more noise than signal. 

a. Initializing the Filter Conditions 

One technique to help smooth the discontinuities is to use the final condi¬ 
tions from the filter of one segment as the initial conditions of the filter for the succeeding 
segment. If the segment lengths provide a reasonably good approximation of stationarity, 
then the filter from one segment should not be significantly different from the filters on 
either side. Therefore the final conditions of one filter should give a fairly good approx¬ 
imation of what the initial conditions should be in the next filter. This technique thus 
reduces the effects of an unwanted transient response caused by the lack of appropriate 
initial conditions. 

b. Overlap-averaging 

Another way to minimize the effects of the boundary discontinuities is pre¬ 
vent any filtered segment from beginning or ending abruptly. One way to accomplish this 
is as follows. Each filtered segment is weighted by a triangular window, as illustrated in 
Figure 4.3 (a) and (b), to form a windowed sequence Sj(n). Then the original signal is 
resegmented as shown in Figure 4.3 (c) and windowed, as shown in Figure 4.3 (d), to 
form another windowed sequence Sjfn). The two windowed sequences are then added to 
form the complete signal s(n) as shown in Figure 4.3 (e). The effect of adding s x i n ) and 
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i 2 (n) is to continually phase out one segment while phasing in the next, thus smoothing 
the discontinuities between segments. 


(a) 


(b) 


(c) 


(d) 


(e) 


Figure 4.3: Overlap-Averaging Technique for Smoothing Boundary Discontinuities, (a) 
Segmented signal, (b) Amplitude scaling for the segmented signal, (c) Re-segmented 
signal, (d) Amplitude scaling for the re-segmented signal, (e) Signal smoothed by overlap¬ 
averaging. 


4. Forward-Backward Filtering 

The FIR filtering process described above induces a linear phase shift in the fil¬ 
tered signal. This phase shift can be compensated for by filtering in the forward and back¬ 
ward directions. For the process described in the previous section this is accomplished 
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by filtering each segment in the forward direction using the final condition of each filter 
as the initial condition for the next filter. After filtering each segment, the process is 
reversed. Beginning with the final filter segment the reversal of each filtered segment is 
then filtered again, with the same filter used in the forward direction. 

5. Filter Parameters 

Four parameters must be chosen in order to obtain effective filtering. The most 
important consideration is the segment length of the original signal, which should be 
as many points as possible. Ideally, the segments should each contain 1000 or more 
points of data because the accuracy of the autocorrelation estimate improves with the 
number of points used. This is not always possible due to the competing requirement for 
stationarity. The second consideration is the order of the FIR Wiener filter. High order 
filters work well if the autocorrelation function is estimated well, as with data which 
changes slowly and can be segmented into long segments. Low order filters give better 
results for data which changes quickly and therefore requires the use of smaller segments. 
Finally, the noise must be modeled well which means that a representative segment of 
the noise must be used and an appropriate model order for the noise must be chosen, as 
described in Chapter III. 

C. RESULTS 

1. Graphical Results 

Figures 4.4 through 4.7 show the results of filtering several transient signals. The 
magnitude scale for each signal represents the integer value of the output of the analog- 
to-digital converter, not the actual strength of the signal in the ocean (for example, 
16-bit quantization yields a magnitude scale from -32,768 to +32,767). Figure 4.4 is 
data from a killer whale song. The sampling rate is 22.05 kHz, the segment length is 
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50 msec (1102 points), the filter order is 50, and the noise was modeled with a 35 tA 
order AR model using 12,000 points of data. Since the signal had a high sampling rate 
it was possible to segment, the data into long segments and still maintain a reasonable 
approximation of stationarity, allowing the use a high order filter. A significant amount 
of noise was removed from filtered signal with little added distortion. Figure 4.5 is data 
from a porpoise whistle. The sampling rate is 12.5 kHz, the segment length is 200 msec 
(2500 points), the filter order is 50, and the noise was modeled with a 35 fk order AR 
model using 20,000 points of data. The sampling rate was slower than that used for 
Figure 4.4, but the signal changed much more slowly, allowing for longer segmentation 
and a high order filter. The filtering process was particularly effective for this transient, 
as the noise was nearly entirely removed from the signal with almost no added distortion. 
Figure 4.6 is data from a stochastically generated transient. The sampling rate and time 
are not mentioned here in order to avoid revealing the source. The segment length is 500 
points, the filter order is 10, and the noise was modeled with a 35 th order AR model using 
20.000 points of data. The characteristics of the signal changed quickly with respect to 
the sampling rate, which necessitated the short segment lengths and the low order filter. 
Much of the original noise was removed, but the distortion was more noticeable than with 
the two previous examples. Figure 4.7 is data from an impulsive source. The sampling 
rate and time are also not mentioned. The segment length is 500 points, the filter order 
is 20, and the noise was modeled with a 35 th order AR model using 20,000 points of data. 
A short segment length and low filter order were similarly required for this transient since 
its characteristics changed quickly with respect to the sampling rate. This was generally 
true for all of the transients we tested which were impulsively generated. 

The amount of distortion caused by the filtering process depends, in general, on 
how quickly the signal changes with respect to the sampling rate. The obvious implication 
is that the short-time Wiener filter should be more effective with signals sampled at very 
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high sampling rates. If the filtered signals are to have synthetically generated noise added 
to them, as described in the next section, then the distortion is not a significant problem 
since the added noise typically hides small amounts of distortion. 

2. Results of Aural Testing 

A subjective listening experiment was developed to see if human listeners are 
able to distinguish between real underwater transient signals and synthetically generated 
transient signals produced by the technique described in this thesis. The goal of this 
experiment was to demonstrate that by effective use of noise modeling and noise removal 
it is possible to generate synthetic transient signals which are indistinguishable from real 
transient ignals. The experiment consisted of two parts and twelve individual tests. Part 
one contained tests one through eight, which each consisted of listening to three signals: 
two real transient signals in noise and one synthetically generated transient signal in 
synthetically generated noise. In Part one each listener was asked to identify the one 
synthetically generated signal for each of the tests. Table 4.1 indicates how the signals 
were generated. The synthetic signals in tests 1, 5, and 7 were generated without first 


Test Number 

Signal A 

Signal B 

Signal C 

1* 

real wxOl 

synthetic wxOl 

real wx02 

2 

synthetic wxOl 

real wxll 

real wxOl 

3 

real wx02 

synthetic wx03 

real wx03 

4 

synthetic qxOl 

real qx02 

real qx03 

5* 

real qxOl 

real qx02 

synthetic qx03 

6 

real qxOl 

synthetic qx02 

real qx03 

7* 

real porpoise 

synthetic porpoise 

real porpoise 

8 

synthetic porpoise 

real porpoise 

real porpoise 


TABLE 4.1: TESTS COMPRISING PART ONE OF THE SUBJECTIVE LISTENING 
EXPERIMENT. The * indicates the tests with signals which were synthesized without 
first removing the noise. The wx and qx signals are as described in Chapter II. 
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removing the noise. The rest of the synthetic signals were generated by first removing 
the noise, then synthesizing the transient, and finally adding synthesized noise at an 
appropriate signal to noise ratio. Synthesizing some signals without first removing the 
noise was intended to show the that noise removal and modeling improves the “realness" 
of the synthetic signals. 

Part two contained tests nine through twelve, which each consist of three real 
transient signals in noise: one transient signal which is unaltered and two transient signals 
which have the original noise replaced by synthetically generated noise. In Part two each 
listener was asked to identify the signal which was unaltered. Table 4.2 indicates how 
the signals in Part two were generated. 


Test Number 

Signal A 

Signal B 

Signal C 

9 

original wxOl 
noise from qxOS 

filtered wxOl with 
noise from orca 

filtered wxOl with j 

10 

filtered porpoise with 
noise from wx02 

filtered porpoise with 
noise from orca 

original porpoise 

11 

filtered qx03 with 
noise from wx02 

original qxOS 

filtered qxOS with 
noise from wx02 

12 

filtered porpoise with 
noise from porpoise 

original porpoise 

filtered porpoise with 

1 noise from porpoise 


TABLE 4.2: TESTS COMPRISING PART TWO OF THE SUBJECTIVE LISTENING 
EXPERIMENT. 

All of the signals in tests 1, 2, 3, and 9 were generated from a common type of 
source and noise background. Similarly, the signals in tests 4. 5, 6, and 11 are all from 
the same type of source. The signals in tests 7, 8, 10, and 12 are all porpoise whistles 
embedded in noise. Placing signals of the same class in each test was intended to give the 
listeners the advantage of being able to compare the synthetic signals with real signals, 
and possibly improve their scores. The tests were also designed to deceive the listeners 
by selecting the signals in such a way that they are drawn to false clues, such as signal 
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to noise ratios or other characteristics. For example, tests 2 and 3 each consist of two 
real signals which sound different from one another along with a synthetic signal which 
sounds very similar to one of the real signals; in test 12 the signal to noise ratio is varied. 
This “trickery” is included in order to demonstrate that the listener is forced to resort to 
clues which have nothing to do with whether or not the signal is real, thereby showing 
the effectiveness of the synthesis. The synthetic signals for the tests were all generated 
using the iterative Prony method [6], with a segment length of forty samples and a model 
order of twelve. The synthetic noise in each test was generated with a 35 th order AR 
model of various noise sources. The filtered signals were obtained by using short-time 
Wiener filtering with the filter parameters chosen to give the best sounding signals. 

Of the 12 people who participated in the test, five are submarine officers, four 
are professors at the Naval Postgraduate School with extensive signal synthesis experience 
(one is also a former submarine officer), two are Navy sonar operators, and one is a P-3C 
(an anti-submarine warfare aircraft) Tactical Action Coordinator. Table 4.3 lists the 
results of the first part of the subjective listening test. The numbers in the table indicate 
how many “votes” each signal received, with the correct choice indicated by the boxes. 


Test Number 

Signal A 

Signal B 

Signal C 

1* 

0 

M 

1 

2 

ni4i 

6 

2 

3 

H 

M 

0 

4 

w 

6 

2 

5* 

i 

0 

uy 

i 6 

3 

1 

.1 

8 

7* 

2 


) 

1 

8 


1 

5 


TABLE 4.3: RESULTS FROM PART ONE OF THE SUBJECTIVE LISTENING EX¬ 
PERIMENT. The * indicates the tests with signals which were synthesized without first 
removing the noise. 
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Again the * indicates the signals which were synthesized without taking the effects of 
the noise into account. Notice that these tests (marked with an *) received the most 
correct votes, showing that traditional synthesis techniques can yield poor models when 
background noise is present. With the noise taken into account in the synthesis process 
and with well designed models the participants could not consistently tell the difference 
between real and synthesized signals and frequently made wrong choices. In some cases 
nearly every participant was deceived by the data (see the results from test 3 and 6). 

Table 4.4 lists the results of the second part of the subjective listening Exper¬ 
iment. Once again the numbers in the table indicate how many “votes” each signal 


Test Number 

Signal A 

Signal B 

Signal C 

9 

M 

0 

3 

10 

6 

1 

5| 

11 

8 


\T\ 

I 

12 

5 

0Z 

3 


TABLE 4.4: RESULTS FROM PART TWO OF THE SUBJECTIVE LISTENING EX¬ 
PERIMENT. 

received, with the correct choice indicated by the boxes. The results of this part show 
that it is possible to change the noise background of a signal (either by replacing the 
original noise with a different type of noise or by replacing the original noise by the same 
type of noise but with a different SNR) without destroying its “authenticity.” In all but 
one of the tests, listeners were unable to consistently identify any of the original signals. 

It was expected that each listener would correctly choose the signals which were 
synthesized without noise removal, yielding a potential minimum score or three. It was 
further expected that, given good signal synthesis, each listener would randomly choose 
the correct answers from the remaining nine tests, giving an expected score of three out 
of the remaining nine. This yields an expected score of six. An average score higher than 
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six would indicate that the signals were not convincingly real. An average score of less 
than six would indicate that the listeners were drawn to false clues, thereby verifying the 
effectiveness of the synthesis. The number of correct scores for the twelve tests ranged 
from one to ten, with the average number of correct answers being 5.5. 

Qualitatively, all of the participants expressed their difficulty in selecting the 
correct choice. In many cases it came down to their “best guess.” All of the participants 
noted that the signals synthesized by removing the noise prior to modeling were superior 
to those synthesized without noise removal. 



V. CONCLUSIONS 


A. DISCUSSION OF RESULTS 

Synthesis of underwater transient signals can be significantly improved if the ambient 
ocean noise is taken into account in the synthesis process. The noise, which is typically 
Gaussian, can be filtered from the signal using linear techniques. In particular, we 
demonstrated that the Wiener filter, which is an optimal linear filter, can be effectively 
modified for use in a “short-time” manner in order to filter stationary noise from a 
transient signal. In order to make the short-time Wiener filter perform satisfactorily 
a number of techniques were required to reduce the effects of distortion caused by the 
filtering. The distortion is mainly due to the inability to obtain exact estimates of 
the autocorrelation function for each of the short-time segments used in the short-time 
Wiener filter while at the same time segmenting the data in short enough segments to 
obtain a good approximation of stationarity. We found that in most cases a reasonable 
balance can be obtained which yields useful results. 

Finding a good model of the noise is important for at least two reasons. First, it 
is necessary in order to remove noise from a real signal prior to analysis and synthesis. 
Secondly, high quality noise models are essential to the formation of high quality synthetic 
signals. We found that the AR model gave the best results in terms of simplicity of 
computation, usefulness in the application of pre-whitening, and the best ‘'sounding” 


B. SUGGESTIONS FOR FURTHER STUDY 

The results from this thesis lead to several interesting possibilities for future research. 
These possibilities include: 


53 



1. Training and testing artificial neural networks. We demonstrated that human lis¬ 
teners are typically unable to tell the difference between real signals and synthet¬ 
ically generated signals when the noise is taken into account in the synthesis pro¬ 
cess. However, this may not necessarily be true for artificial neural networks, as 
the filtering process may introduce false clues unnoticeable to humans which could 
adversely affect training of an automatic classifier. The methods described in this 
thesis should be applied to artificial neural networks to determine if the synthesis 
is as effective as it is for human classifiers. 

2. Real-time implementation. The short-time Wiener filter is non-causal and therefore 
not strictly applicable to real-time processing. It could however potentially be 
modified to work in a nearly real-time manner by applying the short-time method 
described in this thesis to segments of data which are perhaps as long as several 
seconds. For example, the filter could be used to remove the noise from two- 
second segments of data. Each two-second segment would then be segmented into 
sequences of perhaps 50 msec and filtered as described in this thesis. The output 
of such a filter would be delayed by two seconds plus the processing time, however 
it could potentially allow a human classifier to work at nearly real-time in a less 
noisy environment. 

3. Nonstationary noise models. The noise models we developed are stationary. This is 
a good approximation for short duration signals but may be inadequate for longer 
synthetic signals. Time-varying models should be examined since they may further 
improve the authenticity of synthetically generated signals. 

4. Non-linear filtering and modeling. Finally, since the data tested to have a Gaussian 
distribution, the use of linear techniques is justified. Non-linear methods however 
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could yield better results in the presence of known non-Gaussian noise sources. 
These would have to be examined on a case-by-case basis 
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